A New Algorithm for Detecting Text Line in Handwritten Documents
Authors
Abstract
Detecting and segmenting curvilinear text lines in handwritten documents is a significant challenge for handwriting recognition. Assuming no prior knowledge of the script, we model text line detection as an image segmentation problem: we enhance the text line structure using a Gaussian window and adopt the level set method to evolve the text line boundaries. Experiments show that the proposed method achieves high accuracy in detecting text lines in both handwritten and machine-printed documents in many scripts.
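The enhancement step mentioned in the abstract can be illustrated with a small sketch. It assumes the Gaussian window is wider horizontally than vertically, so that broken strokes on the same line merge into a continuous ridge of ink density. The kernel shape, the `sigma_x` and `sigma_y` values, and all function names are assumptions made for illustration, not the paper's settings. A plain threshold on the density map stands in for the level set evolution, which is not reproduced here.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(image, kernel):
    """Convolve every row with the kernel, treating out-of-range pixels as 0."""
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = x + k - radius
                if 0 <= xx < width:
                    acc += w * image[y][xx]
            out[y][x] = acc
    return out


def transpose(image):
    return [list(row) for row in zip(*image)]


def enhance_text_lines(binary, sigma_x=4.0, sigma_y=1.0):
    """Blur a binary ink image with a separable anisotropic Gaussian.

    sigma_x > sigma_y is an assumed choice: it smears ink along the
    (roughly horizontal) writing direction more than across it.
    """
    kx = gaussian_kernel(sigma_x, int(3 * sigma_x))
    ky = gaussian_kernel(sigma_y, int(3 * sigma_y))
    blurred = convolve_rows(binary, kx)                       # horizontal pass
    return transpose(convolve_rows(transpose(blurred), ky))   # vertical pass


def count_line_bands(density, threshold):
    """Count runs of consecutive rows whose peak density exceeds threshold.

    This is a crude stand-in for the level set step, usable only on
    toy pages with roughly horizontal lines.
    """
    bands = 0
    previous = False
    for row in density:
        active = max(row) > threshold
        if active and not previous:
            bands += 1
        previous = active
    return bands


# Toy page: two "text lines" made of isolated ink pixels with gaps between them.
page = [[0] * 30 for _ in range(12)]
for row in (2, 8):
    for col in range(0, 30, 3):
        page[row][col] = 1

density = enhance_text_lines(page)
print(count_line_bands(density, threshold=0.05))
```

On the toy page, the gaps between ink pixels are filled in the density map, so each line appears as a single band. A real curvilinear line would not stay within fixed rows, which is the case the level set evolution of the boundaries is meant to handle.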
Similar resources
Segmentation of Handwritten and Printed Arabic Documents
In this paper, we propose a new text line segmentation method for handwritten and typewritten Arabic document images that uses the Outer Isothetic Cover (OIC) algorithm of a digital object. In the first step, we use this method to segment the composed document into text blocks. In the second step, we extract the text lines from each text block. Finally, each text line is segmented into words or into...
Handwritten Text Line Segmentation by Clustering with Distance Metric Learning
Separating text lines in handwritten documents remains a challenge because the text lines are often non-uniformly skewed and curved. In this paper, we propose a novel text line segmentation algorithm based on Minimal Spanning Tree (MST) clustering with distance metric learning. Given a distance metric, the connected components of the document image are grouped into a tree structure. Text lines are ex...
Text line segmentation in handwritten documents using Mumford-Shah model
Text line segmentation in handwritten documents is an important step in document processing. We present a new text line segmentation method based on the Mumford-Shah model. The algorithm is script independent. In addition, we use morphing to remove overlaps between neighboring text lines and connect broken ones. Experimental results show the validity of our method.
Robust Segmentation of Unconstrained Online Handwritten Documents
A segmentation algorithm that can detect different regions of a handwritten document, such as text lines, tables and sketches, would be extremely useful in a variety of applications such as retrieval, translation and genre classification. However, this task is extremely challenging for handwritten documents, which vary considerably in their structure and content. In this paper, we describe a rob...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating one or more words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc...